4 Making Maps

“Maps invest information with meaning by translating it into visual form.” — Susan Schulten

How the BBC uses R. https://medium.com/bbc-visual-and-data-journalism/how-the-bbc-visual-and-data-journalism-team-works-with-graphics-in-r-ed0b35693535

4.1 Maps with tmap

So far we’ve used the plot() method as a first-order check on our geocomputations. It is possible to make a publishable quality map using plot() but there is a lot of trial and error in getting it to the quality needed.

An efficient alternative is to use functions from the tmap package. There are other packages for making nice maps in R with some of them listed in the syllabus but I like the tmap package because it integrates sf, sp, and raster objects seamlessly.

library(tmap)

Like ggplot2, tmap uses the ‘grammar of graphics’. The grammar separates the input data frame from the aesthetics (how data are visualised). The functions translate the data into aesthetics. The aesthetics can include the location on a geographic map (defined by the geometry), color, and other visual components.

All tmap maps start with the tm_shape() function that takes as input a spatial data frame. The function is followed by one or more layers such as tm_fill() and tm_dots() that defines how a property in the data gets translated to an aesthetic.

Consider the New Zealand simple feature data frame (nz) from the spData package. Make the data frame available, check its class, and make a plot of the geometry.

library(spData)

class(nz)
## [1] "sf"         "data.frame"
plot(nz$geom)

The geometry column is labeled geom.

Check the native coordinate reference system (CRS) of the spatial data frame with the st_crs() function from the sf package.

library(sf)

st_crs(nz)
## Coordinate Reference System:
##   User input: EPSG:2193 
##   wkt:
## PROJCS["NZGD2000 / New Zealand Transverse Mercator 2000",
##     GEOGCS["NZGD2000",
##         DATUM["New_Zealand_Geodetic_Datum_2000",
##             SPHEROID["GRS 1980",6378137,298.257222101,
##                 AUTHORITY["EPSG","7019"]],
##             TOWGS84[0,0,0,0,0,0,0],
##             AUTHORITY["EPSG","6167"]],
##         PRIMEM["Greenwich",0,
##             AUTHORITY["EPSG","8901"]],
##         UNIT["degree",0.0174532925199433,
##             AUTHORITY["EPSG","9122"]],
##         AUTHORITY["EPSG","4167"]],
##     PROJECTION["Transverse_Mercator"],
##     PARAMETER["latitude_of_origin",0],
##     PARAMETER["central_meridian",173],
##     PARAMETER["scale_factor",0.9996],
##     PARAMETER["false_easting",1600000],
##     PARAMETER["false_northing",10000000],
##     UNIT["metre",1,
##         AUTHORITY["EPSG","9001"]],
##     AUTHORITY["EPSG","2193"]]

The CRS is called the New Zealand transverse Mercator. The distance unit is meter (https://epsg.io/2193).

To make a tmap map we first identify the spatial data frame with the tm_shape() function and then add a borders layer.

tm_shape(nz) +
  tm_borders() 

The borders separate New Zealand into 16 administrative regions.

The function tm_shape() and its subsequent drawing layers (here tm_borders()) as a ‘group’. The data in the tm_shape() function must be a spatial object of class simple feature, raster, or an S4 class spatial object.

Here we use a fill layer instead of the borders layer.

tm_shape(nz) +
  tm_fill() 

Here we layer using the fill aesthetic and then add a border aesthetic.

tm_shape(nz) +
  tm_fill(col = 'green') +
  tm_borders() 

Layers are added with the + operator and are functionally equivalent to an overlay.

We can assign the resulting map to an object. For example here we assign the map of New Zealand to the object map_nz.

map_nz <- tm_shape(nz) + 
  tm_polygons()

class(map_nz)
## [1] "tmap"

The resulting object is of class tmap.

New spatial data are added with + tm_shape(new_object). In this case new_obj represents a new spatial data frame to be plotted over the preceding layers. When a new spatial data frame is added in this way, all subsequent aesthetic functions refer to it, until another spatial data frame is added.

For example, let’s add an elevation layer to the New Zealand map. The elevation raster (nz_elev) spatial data frame is in the spDataLarge package on GitHub.

The install_github() function from the devtools package is used to install a package on GitHub. GitHub is a company that provides hosting for software development version control using Git. Git is a distributed version-control system for tracking changes in code during software development.

Note, I’ve done this for you below (Do not run the code chunk below).

library(devtools)
install_github("Nowosad/spDataLarge")

Make the data available.

library(spDataLarge)

Next identify the spatial data for the the new layer by adding tm_shape(map_nz). Then add the raster layer with the tm_raster() function and set the transparency level to 70% (alpha = .7).

map_nz1 <- map_nz +
  tm_shape(nz_elev) + 
    tm_raster(alpha = .7)

map_nz1

The new map object map_nz1 builds on the existing map object map_nz by adding the raster layer nz_elev representing elevation.

We can create new layers with functions. For instance, a function like st_union() operates on the geometry column of a simple feature data frame.

As an example, here we create a line string layer as a simple feature object using three geocomputation functions. We start by creating a union over all polygons (regions) with the st_union() function applied to the nz simple feature object. The result is a multipolygon defining the coastlines.

Then we buffer this multipolgyon out to a distance of 22.2 km using the st_buffer() function. The result is a single polygon defining the coastal boundary around the entire country.

Finally we change the polygon geometry to a line string geometry with the st_cast() function. The default in st_cast() is to simplify the geometry (e.g., a LINESTRING is simpler than a POLYGON).

The operations can be linked together with the pipe (%>%) from the dplyr package.

library(dplyr)

nz_water <- st_union(nz) %>% 
  st_buffer(22200) %>% 
  st_cast(to = "LINESTRING")
class(nz_water)
## [1] "sfc_LINESTRING" "sfc"

Now we add the resulting spatial object as a layer to our map.

map_nz2 <- map_nz1 +
  tm_shape(nz_water) + 
    tm_lines()

map_nz2

Finally, lets create a layer representing the country elevation high points (stored in the object nz_height) onto the map_nz2 object with tm_dots() function.

map_nz3 <- map_nz2 +
  tm_shape(nz_height) + 
    tm_dots()

map_nz3

4.1.1 Example 1: What state contains the geographic centroid of tornado activity?

Import the tornado data. Remove Alaska, Hawaii, and Puerto Rico tornadoes. The native CRS is latitude/longitude so we transform it to a Web Mercator (EPSG 3857) used by Google Maps, OpenStreetMap, Bing, ArcGIS, ESRI.

Alltors.sfdf <- read_sf(dsn = "1950-2018-torn-initpoint") %>%
                  filter(!st %in% c("HI", "AK", "PR")) %>%
                  filter(yr >= 1994) %>%
                  st_transform(crs = 3857)

What is the geometry type in the simple feature column (sfc)?

st_geometry(Alltors.sfdf)
## Geometry set for 30497 features 
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: -13855710 ymin: 2820573 xmax: -7548241 ymax: 6331042
## CRS:            EPSG:3857
## First 5 geometries:
## POINT (-9625796 3665259)
## POINT (-9075878 3214975)
## POINT (-9085897 3246452)
## POINT (-9176066 3480433)
## POINT (-9170500 3493271)

The start location of the tornado track has a POINT geometry. The CRS specified by the EPSG number specifies a particular proj4string.

We use the st_combine() function to create a single feature column with a MULTIPOINT geometry. All the genesis locations are geometrically combined into a single MULTIPOINT simple feature (no attributes).

single <- st_combine(Alltors.sfdf)
st_geometry(single)
## Geometry set for 1 feature 
## geometry type:  MULTIPOINT
## dimension:      XY
## bbox:           xmin: -13855710 ymin: 2820573 xmax: -7548241 ymax: 6331042
## CRS:            EPSG:3857
## MULTIPOINT ((-9625796 3665259), (-9075878 32149...

Next we use the st_centroid() function to find the geographic center of the simple feature column.

center <- st_centroid(single)

The result is a single feature of type POINT. In this case the order of operations is cummutative, but it is not in general.

To create a map, we first get a simple feature data frame (geometry type: MULTIPOLYGON) of the continental U.S. state borders from the spData package.

library(USAboundaries)

States <- us_states() %>%
  filter(!state_name %in% c("Alaska", "Hawaii", "Puerto Rico", "District of Columbia"))
st_crs(States)
## Coordinate Reference System:
##   User input: EPSG:4326 
##   wkt:
## GEOGCS["WGS 84",
##     DATUM["WGS_1984",
##         SPHEROID["WGS 84",6378137,298.257223563,
##             AUTHORITY["EPSG","7030"]],
##         AUTHORITY["EPSG","6326"]],
##     PRIMEM["Greenwich",0,
##         AUTHORITY["EPSG","8901"]],
##     UNIT["degree",0.0174532925199433,
##         AUTHORITY["EPSG","9122"]],
##     AUTHORITY["EPSG","4326"]]
st_geometry(States)
## Geometry set for 48 features 
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -124.7258 ymin: 24.49813 xmax: -66.9499 ymax: 49.38436
## CRS:            EPSG:4326
## First 5 geometries:
## MULTIPOLYGON (((-68.92401 43.88541, -68.87478 4...
## MULTIPOLYGON (((-114.7997 32.59362, -114.8094 3...
## MULTIPOLYGON (((-94.61792 36.49941, -94.3612 36...
## MULTIPOLYGON (((-75.77379 39.7222, -75.75323 39...
## MULTIPOLYGON (((-85.60516 34.98468, -85.47434 3...

CAUTION: The spData package contains a spatial data frame for getting state-level boundaries with the same name (us_states)!!

We note that the CRS of the state borders does not match the CRS of the tornadoes so we use the st_transform() function with the argument crs = st_crs(Alltors.sfdf).

States <- st_transform(States, 
                       crs = st_crs(Alltors.sfdf))

Which state contains the centroid? The function st_contains() returns a TRUE for the state that contains the center.

mtrx <- st_contains(States, 
                    center, 
                    sparse = FALSE)

The result is a 48 x 1 matrix with all entries FALSE except the state containing the centroid.

Select the geometry containing the centroid.

centerState <- States[mtrx[, 1], ]
plot(centerState$geometry)

Then to make a map using the functions from the tmap package we start with the state borders group, then add the tornadoes as a group with a dot layer, then add the state borders subsetted by the state containing the centroid, finally add the centroid location as a group with bubbles as a layer.

tm_shape(States) +
  tm_borders() +
tm_shape(Alltors.sfdf) +
  tm_dots() +
tm_shape(States[mtrx, 1]) +
  tm_borders(col = "red") +
tm_shape(centerState) +
  tm_bubbles(size = .3, col = "red")

Note that we need to think about the order of the layers.

4.1.2 Example 2: State-level tornado rates

Here we compute and display the annual tornado occurrence rate (per 10,000 square km) for each state.

Start by first determining the intersections of the state polygons and the tornado points.

mtrx <- st_contains(States, 
                    Alltors.sfdf, 
                    sparse = FALSE)
dim(mtrx)
## [1]    48 30497

The result is a 48 (states) by 30,497 (tornadoes) matrix of logical values indicating whether or not each tornado occurred within each state.

Next we use the rowSums() function to get the total number of TRUE entries for each states.

rowSums(mtrx)
##  [1]   54  101  984   14  831 1054  225 1374  100 1387 1252  636  978  224  346
## [16] 1186  193  208   71  750 3497  462  577  697   54  619   38 1200 1254  217
## [31]  759    5  470 1603  622 1110 2181   45   46   68   60  263 1271   22   45
## [46]  780  400   12

The key is that the order of the elements from the rowSums() function matches the order of the states in States.

Also see the aggregate() function (KBDI.Rmd).

So we include the counts as a column in the States simple feature data frame. We include the area of each state. This allows us to compute the rate per 10,000 sq. km. The spatial unit is meter so areas are in sq. meter. To change square meter to square kilometer we multiply by 10 billion (10^10).

nT <- rowSums(mtrx)
StateArea <- st_area(States)

States <- States %>%
  mutate(nT,
         rate = nT/StateArea/(2018 - 1994 + 1) * 10^10)
head(States)
## Simple feature collection with 6 features and 14 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -12781060 ymin: 3550008 xmax: -7452828 ymax: 6340332
## CRS:            EPSG:3857
##   statefp  statens    affgeoid geoid stusps      name lsad        aland
## 1      23 01779787 0400000US23    23     ME     Maine   00  79885221885
## 2      04 01779777 0400000US04    04     AZ   Arizona   00 294198560125
## 3      05 00068085 0400000US05    05     AR  Arkansas   00 134771517596
## 4      10 01779781 0400000US10    10     DE  Delaware   00   5047194742
## 5      13 01705317 0400000US13    13     GA   Georgia   00 149169848456
## 6      27 00662849 0400000US27    27     MN Minnesota   00 206232257655
##        awater state_name state_abbr jurisdiction_type
## 1 11748755195      Maine         ME             state
## 2  1027346486    Arizona         AZ             state
## 3  2960191698   Arkansas         AR             state
## 4  1398720828   Delaware         DE             state
## 5  4741100880    Georgia         GA             state
## 6 18929176411  Minnesota         MN             state
##                         geometry   nT               rate
## 1 MULTIPOLYGON (((-7672586 54...   54 0.12242390 [1/m^2]
## 2 MULTIPOLYGON (((-12779442 3...  101 0.09306545 [1/m^2]
## 3 MULTIPOLYGON (((-10532819 4...  984 1.91609715 [1/m^2]
## 4 MULTIPOLYGON (((-8435099 48...   14 0.63421637 [1/m^2]
## 5 MULTIPOLYGON (((-9529523 41...  831 1.53764865 [1/m^2]
## 6 MULTIPOLYGON (((-10823487 6... 1054 0.91800271 [1/m^2]

Then we make a map. Here since all the information is in a single spatial data frame States we only have one group. The group has two layers; borders and fill. The fill layer has the color aesthetic pointed to the column labeled ‘rate’.

tm_shape(States) +
  tm_borders(col = "gray70") +
  tm_fill(col = "rate",
          title = "Annual Rate\n[/10,000 sq. km]") +
  tm_layout(legend.outside = TRUE)

Note: functions from tmap (1) expect data as spatial objects rather than data frames and (2) variable names need to be surrounded by quotes.

All functions start with tm_. The first function in a group needs to be tm_shape(). It specifies the spatial object. Functions following tm_shape() specify sequential aesthetics as layers. Layers are divided into base and derived.

Base layers

  • tm_polygons(): Draws polygons; col
  • tm_symbols(): Draws symbols; size, col, shape
  • tm_lines(): Draws polylines; col, lwd
  • tm_raster(): Draws a raster; col
  • tm_text(): Add text labels; text, size, col

Derived layers

  • tm_fill(): Fills the polygons; see tm_polygons()
  • tm_borders(): Draws polygon borders; none
  • tm_bubbles(): Draws bubbles; see tm_symbols()
  • tm_squares(): Draws squares; see tm_symbols()
  • tm_dots(): Draws dots; see tm_symbols()
  • tm_markers(): Draws markers; see tm_symbols() and tm_text()
  • tm_iso() Draws iso/contour lines; see tm_lines() and tm_text()

Each aesthetic (layer) can take a constant value or a data variable name. For instance, tm_fill(col = 'green') colors all polygons green, while tm_fill(col = "var1"), where "var1" is the name of a data variable in the shape object, creates a choropleth. If a vector of constant values or variable names are provided, the output is a set of maps.

The following layers are cartographic elements:

  • tm_grid(): Add coordinate grid lines
  • tm_credits(): Add credits text label
  • tm_compass(): Add map compass
  • tm_scale_bar(): Add scale bar

For example here we create a choropleth map of countries using an index of happiness.

data(World)

tm_shape(World) +
    tm_polygons(col = "HPI")

The simple feature spatial object World is the only required argument in the tm_shape() function. A polygon layer is added where the fill color (col =) is set to the column HPI in the attribute table of the simple feature data frame.

The legend title is set with the title = argument. For example, here we create a title with the expression() and paste() functions then use the title in the tm_fill() function. Here we also use tm_fill() plus tm_borders() rather than the tm_polygons() as the layer.

legend_title <- expression(paste("Area (km", {}^2, ")"))
tm_shape(nz) +
  tm_fill(col = "Land_area", 
          title = legend_title) +
  tm_borders()

The default are sensible breaks. The argument breaks = sets the breaks manually. The argument n = sets the number of bins into which numeric variables are categorized. The palette = argument defines the color scheme, for example BuGn from the RColorBrewer package.

4.1.3 Map layout, facets, and inserts

Layout functions help create a cartographic map. Elements include the title, the scale bar, margins, aspect ratios, etc. For example, here elements such as a north arrow and a scale bar are added with tm_compass() and tm_scale_bar(), respectively and the tm_layout() function is used to add the title and background color.

map_nz + 
  tm_compass(type = "8star", 
             position = c("left", "top")) +
  tm_scale_bar(breaks = c(0, 100, 200), 
               text.size = 1) +
  tm_layout(title = "New Zealand",
            bg.color = "lightblue")

Faceted maps (referred to as ‘small multiples’) are composed of several maps arranged side-by-side. Facets enable the visualization of how spatial relationships change with respect to another variable.

For example, here the faceted variable is time (year). The simple feature data frame is from the spData package. We first filter the data frame keeping only the years 1970, 1990, 2010, and 2030.

urb_1970_2030 <- urban_agglomerations %>% 
  filter(year %in% c(1970, 1990, 2010, 2030))

Note: The operator %in% acts like a recursive or. If year == 1970 or year == 1990, … For example,

1969:2031 
##  [1] 1969 1970 1971 1972 1973 1974 1975 1976 1977 1978 1979 1980 1981 1982 1983
## [16] 1984 1985 1986 1987 1988 1989 1990 1991 1992 1993 1994 1995 1996 1997 1998
## [31] 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013
## [46] 2014 2015 2016 2017 2018 2019 2020 2021 2022 2023 2024 2025 2026 2027 2028
## [61] 2029 2030 2031
1969:2031 %in% c(1970, 1990, 2010, 2030)
##  [1] FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
## [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [37] FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
## [49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [61] FALSE  TRUE FALSE

We then make a map.

tm_shape(world) + 
  tm_polygons() + 
tm_shape(urb_1970_2030) + 
  tm_symbols(col = "black", 
             border.col = "white",
             size = "population_millions") +
  tm_facets(by = "year", 
            nrow = 4, 
            free.coords = FALSE)

The above code chunk demonstrates key features of faceted maps created with functions from the tmap package.

  • Shapes that do not have a facet variable are repeated (the countries in world in this case).
  • The by = argument which varies depending on a variable (year in this case).
  • nrow/ncol setting specifying the number of rows (and columns) that facets should be arranged into.
  • The free.coords = argument specifies whether each map has its own bounding box.

Small multiples are also generated by assigning more than one value to one of the aesthetic arguments. For example here we map the happiness index (HPI) and gross domestic product per capita (gdp_cap_est).

tm_shape(World) +
    tm_polygons(c("HPI", "gdp_cap_est"), 
        style = c("pretty", "kmeans"),
        palette = list("RdYlGn", "Purples"),
        title = c("Happy Planet Index", "GDP per capita")) 

Two maps are created each with a different scale. All arguments of the layer functions can be vectorized, one for each small multiple map. Arguments that normally take a vector, such as palette =, are placed in a list().

Multiple map objects can also be arranged in a single plot with the tmap_arrange() function.

map1 <- tm_shape(World) +
           tm_polygons("HPI", 
                       style = "pretty",
                       palette = "RdYlGn",
                       title = "Happy Planet Index") 

map2 <- tm_shape(World) +
           tm_polygons("gdp_cap_est", 
                       style = "kmeans",
                       palette = "Purples",
                       title = "GDP per capita") 

tmap_arrange(map1, map2)

4.1.4 Example 3: Tornado locations by month

Suppose we want a map showing the location of tornadoes by month. First add the world map with countries as a polygon layer (fill is gray by default). Then overlay the U.S. country polygons and color them white. Then add the tornado genesis locations as dots faceted by month. Then add the state polygons as borders. Use a North American Lambert Azimuthal Equal Area Projection and make this the master layer. Finally add cartographic elements.

tm_shape(world) +
  tm_polygons() +
tm_shape(world[world$name_long == "United States", ]) +
  tm_polygons(col = "white") +
tm_shape(Alltors.sfdf) +
  tm_dots() + 
  tm_facets(by = "mo", as.layers = TRUE) +
tm_shape(States, is.master = TRUE) + 
  tm_borders() +
  tm_compass(type = "arrow", position = c("left", "bottom")) +
  tm_scale_bar(position = c("left", "bottom"), text.size = .75) +
  tm_style("natural") +
  tm_layout(main.title = "Contiguous U.S. Tornadoes [1950-2018]",
            main.title.position = "center", main.title.size = .85,
            panel.labels = month.name) +
  tm_credits(c(rep("", 11), "Data Source: U.S. SPC"), 
             position = c("right", "bottom"))

We can clearly see the spread of tornado activity from the southeast in January to the central Plains in May to the northern Plains during July and August and then back to the southeast in December.

4.1.5 Inset map

An inset map is a smaller map set within or next to the main map. An inset map is used to contextualize the geographic study area. Here we create a map of the central part of New Zealand’s Southern Alps. The inset map shows where the main map is in relation to all of New Zealand.

The first step is to define the area of interest. Here it is done here by creating a new spatial object nz_region using the st_bbox() function and the st_as_sfc() to make it a simple feature column.

nz_region <- st_bbox(c(xmin = 1340000, xmax = 1450000,
                       ymin = 5130000, ymax = 5210000),
                     crs = st_crs(nz_height)) %>% 
  st_as_sfc()

The second step is to create a base map showing New Zealand’s Southern Alps area. This is the closeup view of where the most important message is stated. The region is clipped to the simple feature column nz_region created above. The layers include a raster of elevations and locations of high points. A scale bar is included.

nz_height_map <- tm_shape(nz_elev, 
                          bbox = nz_region) +
  tm_raster(style = "cont", 
            palette = "YlGn", 
            legend.show = TRUE) +
  tm_shape(nz_height) + 
  tm_symbols(shape = 2, 
             col = "red", 
             size = 1) +
  tm_scale_bar(position = c("left", "bottom"))

nz_height_map

The third step is to create the inset map. It gives a context and helps to locate the area of interest. This map clearly indicates the location of the main map.

nz_map <- tm_shape(nz) + 
  tm_polygons() +
  tm_shape(nz_height) + 
  tm_symbols(shape = 2, 
             col = "red", 
             size = .1) + 
  tm_shape(nz_region) + 
  tm_borders(lwd = 3)

nz_map

The final step is to combine the two maps. The viewport() function from the grid package is used to give a center location (x and y) and the size (width and height) of the inset map.

library(grid)

nz_height_map
print(nz_map, 
      vp = viewport(.8, .27, width = .5, height = .5))

Additional details are available here: https://geocompr.robinlovelace.net/adv-map.html

4.1.6 Turn a static map into an interactive map

Leaflet is a popular open-source platform (javascript) for making/serving interactive maps. The leaflet package provides access to the platform.

As an example, here we create a world map by using the default mapping arguments in leaflet() and zoomable tiles with the addTiles() layer. The piping operator (%>%) is used to ‘add’ layers.

library(leaflet)

m <- leaflet() %>% 
  addTiles()
m

Note: if the map doesn’t render in your Viewer tab, then click the Viewer tab and select “Show in new window”.

Here we use setView() with longitude and latitude arguments in decimal degrees to center the map on FSU and use a zoom of 17.

m <- m %>%
  setView(lng = -84.29849, 
          lat = 30.44188, 
          zoom = 17)
m

A cool feature of tmap is that we can create an interactive map using the same code as we used to create a static map.

For example our static map of New Zealand (map_nz) is viewed interactively by switching to view mode.

tmap_mode("view")
## tmap mode set to interactive viewing
map_nz

With the interactive mode turned on, all maps produced with tmap will launch as zoomable HTML. This feature includes the ability to specify the basemap with tm_basemap() (or tmap_options()) as demonstrated here.

map_nz + 
  tm_basemap(server = "OpenTopoMap")

Q: Why is there no topography for New Zealand?

We can also create interactive maps with the tmap_leaflet() function.

The view mode in tmap works with faceted plots. The argument sync in tm_facets() is used to produce multiple maps with synchronized zoom and pan settings.

world_coffee <- left_join(world, 
                          coffee_data, 
                          by = "name_long")
tm_shape(world_coffee) + 
  tm_polygons(c("coffee_production_2016", 
                "coffee_production_2017")) + 
  tm_facets(nrow = 1, sync = TRUE)

Change the view mode back to plot.

tmap_mode("plot")
## tmap mode set to plotting

4.1.7 Example 4: Tornado occurrences on a grid

Import the tornado data convert it to geographic coordinates.

Alltors2.sfdf <- st_read(dsn = "1950-2018-torn-aspath") %>%
                       filter(!st %in% c("HI", "AK", "PR")) %>%
                       filter(yr >= 2014 & mag >= 0)
## Reading layer `1950-2018-torn-aspath' from data source `/Users/jelsner/Desktop/Projects/ASSFG-Book/1950-2018-torn-aspath' using driver `ESRI Shapefile'
## Simple feature collection with 63645 features and 22 fields
## geometry type:  LINESTRING
## dimension:      XY
## bbox:           xmin: -163.53 ymin: 18.13 xmax: -64.9 ymax: 61.02
## CRS:            4326

Create the leaflet map with a view set on campus. First use the addTiles() function and then the addPolylines() function with the argument data = Alltors2.sfdf. The piping operator is used to add layers.

m <- leaflet() %>% 
  setView(-84.29849, 30.44188, zoom = 17) %>%
  addTiles() %>%
  addPolylines(data = Alltors2.sfdf)
m

We can also use the mapView() function from the mapview package. It also provides interactivity for easy and quick visualization during spatial data analysis. But it is not intended for fine-tuned presentation quality map production.

library(mapview)

mapView(Alltors2.sfdf, 
        zcol = "mag")

Use mapView() to create an interative map of tornado counts on a raster. First create a raster of grid cells at a resolution of a quarter of a degree latitude and longitude. Use the rasterize() function from the raster package to count the number of tornado occurrences in each cell.

We first create the raster with a cell resolution of 1/4 degree. We specify field = "om" so the result is a raster layer rather than a raster brick (all variables in Alltors2.sfdf).

library(raster)

r <- raster(xmn = -125, xmx = -67, 
            ymn = 24, ymx = 50)
res(r) <- .25
Tor_r <- rasterize(x = Alltors2.sfdf, 
                   y = r, 
                   field = "om",
                   fun = "count")

Finally use the mapView() function to create the interactive map.

mapView(Tor_r,
        alpha.regions = .35)

4.2 Maps with ggplot

The package ggplot2 supports sf objects with the function geom_sf(). The syntax is similar to that used by tmap functions. An initial ggplot() function followed by one or more layers, that are added with +. The layers begin with geom_.

For example, here we plot a choropleth map of the median income by region across New Zealand and add a layer indicating elevation heights.

library(ggplot2)

ggplot() + 
  geom_sf(data = nz, aes(fill = Median_income)) +
  geom_sf(data = nz_height) +
  scale_x_continuous(breaks = c(170, 175))

The function geom_sf() uses the geometry column of the simple feature data frame. It does not work with S4 spatial data frames.

The function automatically plots graticules (lines of latitude and longitude) labels. The default settings for the graticules can be overridden using scale_x_continuous(), scale_y_continuous() or coord_sf(datum = NA).

An advantage of using functions from the ggplot2 package for mapping is that the package has a strong user-community and there are many add-on packages. And they have just tripled the speed of this function that will be available with the next version of ggplot2

The maps (and plots) from ggplot2 are given a level of interactivity when printed using the function ggplotly() from the plotly package. Try plotly::ggplotly(g1), for example.

g1 <- ggplot() + 
  geom_sf(data = nz, aes(fill = Median_income)) +
  geom_sf(data = nz_height) +
  scale_x_continuous(breaks = c(170, 175))
plotly::ggplotly(g1)

But ggplot2 has a few drawbacks for making maps. The geom_sf() function does not always produce a nice legend. Raster objects are not supported and need to be converted into a data frame before plotting.

4.3 Cartograms

A cartogram is a map where the geometry is proportionately distorted to represent a variable. We can create cartograms with functions in the cartogram package.

The cartogram_cont() function creates continuous area cartograms. It accepts an sf object and name of the variable (column) as inputs. Additionally, the maximum number of iterations for the transformation can be set with the intermax = argument.

For example, here we represent median income in New Zealand’s regions as a continuous cartogram/

library(cartogram)

nz_carto <- cartogram_cont(nz, 
                           weight = "Median_income", 
                           itermax = 5)
## Mean size error for iteration 1: 2.55875465805575
## Mean size error for iteration 2: 1.79272875527739
## Mean size error for iteration 3: 1.43598715415039
## Mean size error for iteration 4: 1.25491843756565
## Mean size error for iteration 5: 1.15398221468619
tm_shape(nz_carto) + 
  tm_polygons("Median_income")

We create non-contiguous cartograms using cartogram_ncont() and Dorling cartograms using cartogram_dorling(). Non-contiguous area cartograms are created by scaling down each region based on a weighting variable.

States <- spData::us_states

States2163 <- st_transform(States, 
                           crs = 2163)
States2163_ncont <- cartogram_ncont(States2163, 
                                    "total_pop_15")
tm_shape(States2163_ncont) +
  tm_polygons("total_pop_15")

Dorling cartograms consist of circles with areas proportional to the weighting variable.

States2163_dorling <- cartogram_dorling(States2163, 
                                        "total_pop_15")
tm_shape(States2163_dorling) +
  tm_polygons("total_pop_15")

Challenge The file sids2.zip contains shapefiles indicating the number and rate (per 10,000 births) of sudden infant death syndrome (SIDS) by county in North Carolina. Load the data as a simple features spatial object and make small multiple choropleth maps of the SIDS rates in 1974 and 1979 (SID74 and SID79 variables) using functions from the tmap package. Create a continuous cartogram of the SIDS rates in 1979.

Import the data.

download.file("http://myweb.fsu.edu/jelsner/temp/data/sids2.zip",
              "sids2.zip")
unzip("sids2.zip")
sids.sf <- read_sf(dsn = "sids2")
st_crs(sids.sf) <- 4326
sids.sf <- st_transform(sids.sf, crs =  3857)

Small multiple maps

tm_shape(sids.sf) +
  tm_polygons(c("SID74", "SID79"))

Cartogram

sids_carto <- cartogram_dorling(sids.sf, "SID79")
tm_shape(sids_carto) + 
  tm_polygons("SID79",
              title = "Per 10K live births") +
  tm_compass(type = "8star", 
             position = c("right", "bottom"))

4.4 Coordinate systems

This needs to be changed to reflect the new WKT proj strings, etc.

See

A Coordinate Reference System (CRS) defines how the spatial elements of the data relate to the surface of the Earth. CRSs are either geographic or projected.

Geographic coordinate systems identify any location on the Earth’s surface using longitude and latitude. Longitude is location in the east-west direction in angular distance from the Prime Meridian. Latitude is angular distance north or south of the equator. Distance in geographic CRSs are not measured in meters.

The surface of the Earth in geographic coordinate systems is represented by a spherical or ellipsoidal surface. Spherical models assume that the Earth is a perfect sphere of a given radius. Spherical models have the advantage of simplicity but are rarely used because they are inaccurate. Ellipsoidal models are defined by two parameters: the equatorial radius and the polar radius.

Ellipsoids are part of a wider component of CRSs: the datum. This contains information on what ellipsoid to use (with the ellps parameter in the proj4 CRS library) and the precise relationship between the cartesian coordinates and location on the Earth’s service. These additional details are stored in the towgs84 argument of proj4 notation (see proj4.org/parameters.html for details). These allow local variations in Earth’s surface, e.g. due to large mountain ranges, to be accounted for in a local CRS.

There are local and geocentric datums. In a local datum such as NAD83 the ellipsoidal surface is shifted to align with the surface at a particular location. In a geocentric datum such as WGS84 the center is the Earth’s center of gravity and the accuracy of projections is not optimized for a specific location. Available datum definitions are seen by typing

st_proj_info(type = "datum")

4.4.1 Projected coordinate systems

Projected CRSs are based on Cartesian coordinates on a flat surface. They have an origin, x and y axes, and a unit of distance (meter). All projected CRSs are based on a geographic CRS, described in the previous section, and rely on map projections to convert the three-dimensional surface of the Earth into Easting and Northing (x and y) values in a projected CRS.

This transition cannot be done without adding some distortion. Therefore, some properties of the Earth’s surface are distorted in this process, such as area, direction, distance, and shape. A projected coordinate system can preserve only one or two of those properties. Projections are often named based on a property they preserve: equal-area preserves area, azimuthal preserve direction, equidistant preserve distance, and conformal preserve local shape.

There are three main groups of projection types - conic, cylindrical, and planar. In a conic projection, the Earth’s surface is projected onto a cone along a single line of tangency or two lines of tangency. Distortions are minimized along the tangency lines and increase with the distance from those lines in this projection. Therefore, it is the best suited for maps of mid-latitude areas. A cylindrical projection maps the surface onto a cylinder. This projection could also be created by touching the Earth’s surface along a single line of tangency or two lines of tangency. Cylindrical projections are used most often when mapping the entire world. A planar projection projects data onto a flat surface touching the globe at a point or along a line of tangency. It is typically used in mapping polar regions. A list of projection types is available by typing

st_proj_info(type = "proj")

4.4.2 CRSs in R

We describe a CRS in R with an epsg code or with a proj4string definition. An epsg code is shorter, and therefore easier to remember. The code also refers to only one, well-defined coordinate reference system. On the other hand, a proj4string definition allows more flexibility. This way you can specify many different projections, and modify existing ones. This also makes the proj4string approach more complicated. The epsg code points to one and only one CRS.

Spatial packages support a wide range of CRSs and they use the long-established proj4 library. Other than searching for EPSG codes online, another quick way to find out about available CRSs is via the rgdal::make_EPSG() function, which outputs a data frame of available projections. Before going into more detail, it’s worth learning how to view and filter them inside R, as this could save time on the internet. The following code will show available CRSs interactively, allowing you to filter ones of interest (try filtering for the OSGB CRSs for example):

crs.data <- rgdal::make_EPSG()
View(crs.data)

With simple features the CRS of an object can be retrieved using st_crs(). For example:

filepath <- system.file("vector/zion.gpkg", package = "spDataLarge")
sf.poly <- st_read(filepath)

Our new object is a polygon representing the borders of Zion National Park (?zion).

st_crs(sf.poly) # get CRS

In cases when a coordinate reference system (CRS) is missing or the wrong CRS is set, the st_set_crs() function can be used:

sfdf <- st_set_crs(sf.poly, 26912) # set CRS

The warning message informs us that the st_set_crs() function does not transform data from one CRS to another.

The projection() function can be used to access CRS information from a Raster* object:

projection(new.raster) # get CRS

The same function, projection(), is used to set a CRS for raster objects. The main difference, compared to vector data, is that raster objects only accept proj4 definitions:

projection(new.raster) = "+proj=utm +zone=12 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs" # set CRS

4.4.3 Units

An important feature of CRSs is that they contain information about spatial units. Clearly it is vital to know whether your house is measured in feet or meters. The same applies to maps. It is good cartographic practice to add a scale bar onto maps to demonstrate the relationship between distances on the page or screen and distances on the ground. Likewise, it is important to formally specify the units in which the geometry data or pixels are measured to provide context, and ensure that subsequent calculations are done in context.

A novel feature of geometry data in simple feature objects is that they have native support for units. This means that distance, area and other geometric calculations return values that come with a units attribute, defined by the units package (Pebesma, Mailund, and Hiebert 2016). This is advantageous because it prevents confusion caused by the fact that different CRSs use different units (most use meters, some use feet). Furthermore, it also provides information on dimensionality, as illustrated by the following calculation which reports the area of Nigeria:

library(spData)
nigeria <- world[world$name_long == "Nigeria", ]
st_area(nigeria)

The result is in units of square meters (square meters), showing a) that the result represents two-dimensional space and b) and that Nigeria is a large country. This information, stored as an attribute is advantageous for many reasons, for example it could feed into subsequent calculations such as population density.

Reporting units prevents confusion. To take the Nigeria example, if the units remained unspecified, one could incorrectly assume that the units were in square kilometers. To translate the huge number into a more digestible size, it is tempting to divide the results by a million (the number of square meters in a square kilometer):

st_area(nigeria) / 10^6

However, the result is incorrectly given again as square meters. The solution is to set the correct units with the units package:

units::set_units(st_area(nigeria), km^2)

Units are of equal importance in the case of raster data. However, so far the sf package is the only spatial package that supports units, meaning that people working on raster data should approach changes in the units of analysis (for example, converting pixel widths from imperial to decimal units) with care. The new.raster object (see above) uses a UTM projection with meters as units. Consequently, its resolution is also given in meters but you have to know it, since the res() function simply returns a numeric vector.

res(new.raster)

If we use the WGS84 projection, the units change.

repr <- projectRaster(new.raster, crs = "+init=epsg:4326")
res(repr)

Again, the res() command gives back a numeric vector without any unit, forcing us to know that the unit of the WGS84 projection is decimal degrees.